Abstract

We describe a method for automatically generating
Lexical Transfer Rules (LTRs) from word equivalences
using transfer rule templates. Templates are skeletal
LTRs, unspecified for words. New LTRs are created by
instantiating a template with words, provided that the
words belong to the appropriate lexical categories
required by the template. We define two methods for
creating an inventory of templates and using them to
generate new LTRs. A simpler method consists of
extracting a finite set of templates from a sample of
hand-coded LTRs and directly using them in the
generation process. A further method consists of
abstracting over the initial finite set of templates to
define higher level templates, where bilingual
equivalences are defined in terms of correspondences
involving phrasal categories. Phrasal templates are
then mapped onto sets of lexical templates with the aid
of grammars. In this way an infinite set of lexical
templates is recursively defined. New LTRs are created
by parsing input words, matching a template at the
phrasal level and using the corresponding lexical
categories to instantiate the lexical template. The
definition of an infinite set of templates enables the
automatic creation of LTRs for multi-word,
non-compositional word equivalences of any cardinality.



1 Introduction 

It is well known that Machine Translation (henceforth
MT) systems need information about the different ways
in which words can be translated, depending on their
syntactic and semantic context. Such lexical transfer
rules (henceforth LTRs) are notoriously time-consuming
for humans to construct. Our task is to automatically
generate LTRs.

An LTR can be seen as a word equivalence plus an
associated transfer pattern. By word equivalence we
mean a translation pair simply stated in terms of
words, in a dictionary-like fashion. A transfer pattern
specifies how transfer is to be performed for each of
the morphological variants of the words in the
equivalence and for different syntactic contexts. For
instance, given an English-Spanish word equivalence

get lucky <-> tener suerte

the associated transfer pattern would have to account
for all the following equivalences:



I get lucky                  Tengo suerte
I will get lucky             Tendre suerte
I would have got lucky       Habria tenido suerte
Getting lucky                Teniendo suerte
I start getting lucky        Empiezo a tener suerte
I start getting very lucky   Empiezo a tener mucha suerte



where the Spanish sentences can be glossed as follows: 

a. Tengo suerte 

I have good luck 

b. Tendre suerte 

I will have good luck 

c. Habria tenido suerte 

I would have had good luck 

d. Teniendo suerte 
Having good luck 

e. Empiezo a tener suerte 

I start to have good luck 

f. Empiezo a tener mucha suerte 

I start to have much good luck 



The last example in the list of translation pairs
shows that a transfer pattern also has to account for
modifiers. In the example, very is translated by the
adjective mucha, whereas in a sentence like I start
getting very lazy it would be translated by the adverb
muy (Empiezo a volverme muy perezoso).

Thus, a bilingual lexicon of such transfer rules is a
different object from a collection of word equivalences,
which is the definition of a bilingual lexicon most
often found in the literature on the automatic creation
of bilingual lexicons ([3], [4], to cite recent
examples). In fact, those techniques and the one
described here are disjoint and complementary, as the
output of those tools can be used as input to LTR
development. Given a collection of word equivalences,
we focus on how transfer patterns can be associated
with them to create complete LTRs.

This task has rarely been tackled before. Several
techniques have been proposed for the automatic
acquisition of word equivalences (see references
above), but very few for the automatic acquisition of
full LTRs (e.g. [6]), despite the high cost of their
manual development. Bilingual coding is often a
bottleneck in MT system development. Unlike other
linguistic resources, such as grammars, lexicons have
an open-ended linear growth and their quality can be
directly related to their size. For this reason, there
is often a mismatch between the development time frames
of bilingual lexical resources and those of other
modules.



2 Basic ideas 

2.1 Template based generation 

We use a bootstrap approach to transfer rule creation.
An initial hand-coded bilingual lexicon is used as a
basis for defining a set of transfer rule templates,
i.e. skeletal rules unspecified for words.
Subsequently, appropriate transfer rule templates are
associated with new word equivalences, on the basis of
the morphosyntactic features of those words, to
construct complete LTRs. The approach described here
shares this underlying template-based bootstrap
philosophy with the approach described in [6], but
differs from it in three key respects: the resources it
uses, the way templates are created and the way LTRs
are created from word equivalences and templates. LTR
templates are also akin to tlinks, as described in [1]
and [2]. These works describe how to use tlinks for the
semi-automatic generation of single-word equivalences.
However, they do not deal with the creation of an
inventory of tlink types or with generation from
multi-word equivalences.

For the sake of exposition, we use here a simplified
version of LTRs and templates, showing only words
(for LTRs), syntactic categories and indices, the latter
represented by tuples of subscript lowercase letters. A
schematic LTR and template are shown in (1) and (2),
respectively. For a description of the MT system and
the full LTR formalism see [5] and [7].
Table 1: Incremental template coverage

Given a word equivalence expressing a translation
pair, our goal is to create a transfer rule directly
usable by an MT system. In other words, the goal is to
associate a transfer pattern, as informally described
above, with a word equivalence. We describe two
approaches, the latter of which is an extension of the
former.

2.2 The enumerative approach 

The goal of creating templates and using them in
generating LTRs can be accomplished through the
following steps:

1. Create transfer rule templates:

(a) Define a set of LTR templates. Each template
    represents a transfer pattern. The template
    definition task is carried out by extracting LTR
    templates from an initial hand-coded bilingual
    lexicon. This can be done simply by removing words
    from LTRs, normalizing variables by renaming them
    in some canonical way (so that two instances of the
    same template do not differ only in variable
    names), and then ranking templates by frequency, if
    some cutoff is to be applied. Table 1 shows the
    incremental coverage of the set of templates we
    extracted from our initial hand-coded
    English-Spanish bilingual lexicon. We refer to the
    template creation process described here as the
    enumerative approach to building templates.

(b) Associate a set of constraints with each template.
    Typically, these are morphosyntactic constraints on
    the words to be matched against the template.
    Basically, such constraints ensure that an input
    word belongs to the same lexical category as the
    template item it has to match. The same goal could
    also be achieved by directly unifying a lexical
    category associated with a word with the
    corresponding template item, instead of having
    separate constraints.

2. Create transfer rules:

(a) Given a word equivalence, create an LTR if the
    lexical descriptions of the words in the
    translation pair satisfy all the constraints
    associated with a template (or unify with their
    corresponding items in the template). In that case,
    an LTR is created by simply instantiating the
    successful template with the words in the word
    equivalence.
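
The steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation used in the system: the encoding of an LTR item as a (word, category, indices) tuple, the function names extract_template, rank_templates and instantiate, and the toy entries and lexicon are all hypothetical.

```python
from collections import Counter
import string

# Toy hand-coded LTRs (assumed encoding): each side is a list of
# (word, category, indices) items. Index names are arbitrary.
LTRS = [
    ([("Buddha", "n", ("x",))], [("buda", "n", ("x",))]),
    ([("Wonderland", "n", ("k",))],
     [("pais", "n", ("k",)), ("de", "p", ("k", "m")),
      ("las", "d", ("m",)), ("maravillas", "n", ("m",))]),
    ([("book", "n", ("q",))], [("libro", "n", ("q",))]),
]

def extract_template(ltr):
    """Step 1(a): remove the words and rename indices canonically."""
    names = {}

    def canon(index):
        # First index seen becomes 'a', the second 'b', and so on.
        if index not in names:
            names[index] = string.ascii_lowercase[len(names)]
        return names[index]

    def strip(side):
        return tuple((cat, tuple(canon(i) for i in idxs))
                     for _word, cat, idxs in side)

    lhs, rhs = ltr
    return (strip(lhs), strip(rhs))

def rank_templates(ltrs):
    """Step 1(a), continued: rank templates by corpus frequency."""
    return Counter(extract_template(l) for l in ltrs).most_common()

def instantiate(template, src_words, tgt_words, lexicon):
    """Steps 1(b) and 2(a): build an LTR only if every word may bear
    the category of the template item it is matched against."""
    def fill(items, words):
        if len(items) != len(words):
            return None
        filled = []
        for (cat, idxs), word in zip(items, words):
            if cat not in lexicon.get(word, ()):
                return None  # category constraint violated
            filled.append((word, cat, idxs))
        return filled

    lhs = fill(template[0], src_words)
    rhs = fill(template[1], tgt_words)
    return None if lhs is None or rhs is None else (lhs, rhs)
```

Because indices are renamed in order of first occurrence, the entries for Buddha and book collapse into the same template even though they were coded with different variable names, which is the purpose of the normalization step.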

An enumerative approach to template creation
guarantees adequate coverage for creating most of the
LTRs needed in an MT system, as discussed in [6]. A
simple automatic LTR generation procedure can be
implemented by selecting the most significant templates
in the database. This can be done, as hinted above, by
counting the occurrences of each template in the LTR
corpus and choosing those that rank best. The top
ranking templates are then used to directly map input
word equivalences onto LTRs. This approach was
implemented and used with good results: e.g. in an
early test run on a 1544-entry word list downloaded
from the World Wide Web, LTRs were created for 79% of
the input word equivalences. Further results are
discussed in Section 4.
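
The selection of top ranking templates can be illustrated with a small sketch. We assume here that incremental coverage means the cumulative share of the LTR corpus accounted for by the k most frequent templates; the counts below are hypothetical and are not figures from our lexicon, and the function names are ours.

```python
from itertools import accumulate

def incremental_coverage(template_counts):
    """Cumulative fraction of the LTR corpus covered by the top-1,
    top-2, ... templates, given per-template occurrence counts."""
    ranked = sorted(template_counts, reverse=True)
    total = sum(ranked)
    return [running / total for running in accumulate(ranked)]

def smallest_cutoff(coverage, target):
    """Smallest number of top templates reaching the target coverage,
    or None if the target is never reached."""
    for k, value in enumerate(coverage, start=1):
        if value >= target:
            return k
    return None

# Hypothetical occurrence counts for five templates.
counts = [15, 50, 4, 25, 6]
coverage = incremental_coverage(counts)
cutoff = smallest_cutoff(coverage, 0.9)
```

With these made-up counts, three templates are enough to reach 90% coverage, which is the kind of cutoff decision the frequency ranking supports.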

2.3 The generative approach 

The idea of adding recursion to the template definition
procedure, thus replacing a finite set of templates with
an infinite one, was prompted by work on phrasal verbs.
Phrasal verbs exhibit a larger variability than other
collocations. One of the problems is that their
translations are often paraphrases, because a target
language might lack a direct equivalent of a source
phrasal verb. Table 2 illustrates this point (e.g. sit
through something <-> permanecer hasta la fin de algo).

A finite set of templates could still be used for
phrasal verbs, but this would require a much larger
initial LTR corpus than is necessary for other
collocations, in order to preserve a high automatic
generation rate. An alternative solution is based on
introducing a further level of abstraction, by defining
higher level, underspecified templates which state
bilingual equivalences in terms of phrasal categories
instead of lexical categories. Then, a simple grammar
is used to map such phrasal categories onto sets of
lexical categories in order to derive completely
specified templates.

Despite the template variability in terms of sequences
of syntactic categories, a much higher regularity can
be found by defining templates in terms of
constituency. For instance, all the lexical
equivalences listed for sit in Table 2 can be reduced
to two basic patterns in terms of phrasal categories:¹

(3) VP <-> VP

(4) VP/NP <-> VP/NP

where VP/NP represents a verbal phrase with a noun
phrase gap. A further generalization is that a VP on
either side of a template tends to be equivalent to a
phrase of the same type, with the same number and
type of gaps. We note incidentally that this further
generalization does not hold for all categories, in
terms of category identity. For example, an English
adjective often corresponds to a Spanish prepositional
phrase (e.g. fashionable <-> de moda, stainless <-> sin
tacha).

The abstraction process consists of partitioning
lexical templates into classes such that each class is
identified by a phrasal template, in which a group of
lexical categories is replaced by a phrasal category.
All the lexical templates in a class can be obtained by
replacing the phrasal category with one of its lexical
projections. Such replacement can be carried out on a
purely monolingual basis, by using a simple grammar
to define constituency. The key requirement on the
abstraction process is that the resulting abstract
template be invariant with respect to lexical
replacement, i.e. that the replacement does not involve
any element in the abstract template other than the
replaced phrasal category. This restriction amounts to
requiring that a phrasal category be self-contained in
terms of variable sharing, i.e. that the lexical
categories it dominates introduce no new variables to
be shared with items external to the phrase itself; or,
if such sharing happens, it must be entirely
predictable and unambiguous, i.e. the variable-sharing
lexical category must be marked in such a way that it
can be uniquely identified.

Once a phrasal template has been defined, and a
grammar is available for mapping phrasal categories
onto lexical categories, new, previously unseen lexical
templates can be derived that were not present in the
initial class which the phrasal template was abstracted
over. Depending on the recursivity of the grammars in
use, an infinite set of lexical templates can be defined
via a finite set of phrasal templates. Therefore, we
refer to this process as a generative approach to
building templates. For example, the templates
underlying the bilingual entries in (5) allow one to
infer the template underlying the bilingual entry in
(6).

(5) Buddha <-> buda

    Wonderland <-> pais de las maravillas

(6) Halloween <-> vispera del Dia de los Santos
Namely, from the lexical templates in (7), the phrasal
template in (8) can be inferred. In turn, the new
lexical template in (9) can be derived from (8),
provided that the relevant grammar licenses the
projection of a phrasal category NBAR onto an
appropriate sequence of lexical categories.

(7) n_a <-> n_a

    n_a <-> n_a & p_{a,b} & d_b & n_b

(8) n_a <-> NBAR_a

(9) n_a <-> n_a & p_{a,b} & d_b & n_b & p_{b,c} & d_c & n_c

¹ We adopt the convention of using lowercase labels
for lexical categories and uppercase labels for phrasal
categories.

Phrasal verb equivalences:

Glosses for Spanish:

a. ponerse comodo
   put oneself comfortable
b. sentarse
   sit oneself
c. posar para algo
   pose for sth
d. sustituir algn
   replace sb
e. participar como observador en algo
   take part as observer in sth
f. permanecer hasta la fin de algo
   [...] until the end of sth
g. esperar
   wait
h. velar algn
   watch over sb
i. incorporarse
   raise oneself
j. sentar algn
   sit sb
k. incorporar algn
   raise sb
l. no participar en algo
   do not take part in sth

where:
En. sth = Sp. algo = something
En. sb = Sp. algn = somebody

Table 2: Some phrasal verb equivalences for the verb sit, and associated glosses for Spanish.

Figure 1: System architecture.
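
The derivation of (9) from (8) can be made concrete with a small sketch. The recursive rule used below, NBAR_x -> n_x | n_x & p_{x,y} & d_y & NBAR_y, is an assumption chosen only because it reproduces the right hand sides in (7) and (9); it is not the grammar used in our system, and the function names are hypothetical.

```python
import string

def expand_nbar(depth, start=0):
    """Project NBAR_x onto lexical categories with the assumed rule
         NBAR_x -> n_x | n_x & p_{x,y} & d_y & NBAR_y
    taking the recursive branch `depth` times. Indices are drawn
    from a, b, c, ... beginning at position `start`."""
    letters = string.ascii_lowercase
    x = letters[start]
    items = [f"n_{x}"]
    if depth > 0:
        y = letters[start + 1]
        # The preposition links the head index x to the new index y.
        items += [f"p_{{{x},{y}}}", f"d_{y}"]
        items += expand_nbar(depth - 1, start + 1)
    return items

def lexical_template(depth):
    """Replace NBAR_a in the phrasal template n_a <-> NBAR_a."""
    return "n_a <-> " + " & ".join(expand_nbar(depth))

for depth in range(3):
    print(lexical_template(depth))
```

Depths 0 and 1 give back the two templates in (7), while depth 2 yields the previously unseen template in (9). Since the depth is unbounded, the single phrasal template stands for an infinite set of lexical templates.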

The overall architecture of the generative approach
is shown in Figure 1. The idea of the whole process
is to exploit monolingual regularities to account for
a non-compositional bilingual equivalence. In a
non-compositional equivalence, direct correspondences
between lexical items on either side cannot be
established. The two sides are only equivalent as
wholes. However, a non-compositional equivalence can be
accounted for by decomposing it into a phrasal
bilingual equivalence and monolingual mappings from
phrasal to lexical categories.

Implementing a template grammar allows one to obtain
adequate template coverage while requiring only a small
initial LTR corpus. In our system the template
abstraction task was performed manually. Although the
implementation of an automatic template abstraction
procedure could be foreseen, the high reliability
required of templates demands that strict human control
still be exercised at some point in the template
abstraction and grammar definition phases. It is also
worth pointing out that, in our experience, the grammar
development task is not very labour intensive. Such
grammars have to perform a very limited task, and deal
with a restricted and controlled input. Basically, they
only have to account for the constituent structure of
very simple and syntactically ordinary phrases. In our
case, the grammar development time was usually measured
in hours.

3 Implementation 

In this section the generation of an LTR from a phrasal
template is described. We illustrate the procedure
with the aid of a worked example. We show how the
LTR in (11) is generated from the word equivalence
in (10).

(10) sit in on sth <->
     participar como observador en algo

(11) sit:iv_{a,b,c} & in:adv_c & on:p_{a,d} <->
     participar:iv_{a,b,e} & como:p_{e,f} &
     observador:n_f & en:p_{a,d}

Pragmatic reasons induced us to take a hybrid approach
to the template grammar construction task, combining
the enumerative and generative approaches to building
templates. It turned out that templates show a larger
variability on the Spanish side than on the English
side. Table 2 is a good example of this point. This
fact might be related to a larger use of concise and
productive phrasal verb patterns in English, or perhaps
to the fact that our word equivalence coding was driven
by English collocations, thus involving the presence of
paraphrases on the Spanish side only. In any case, it
appeared that the enumerative approach was sufficient
to adequately cover the English side without resulting
in a proliferation of templates. Hence, we constructed
a set of templates in which only the Spanish side
contained a phrasal category, while the English side
was fully specified. For example:

(12) iv_{a,b,c} & adv_c & p_{a,d} <-> VP_{a,b}/NP_d

For each such partially specified template, the
Spanish side has to be generated.

Figure 2: Template right hand side generation tree.

The selection of a phrasal template candidate is
performed by doing a lexical lookup for the source
words and matching their morphosyntactic
representations with the corresponding left hand side
template items. This can be done directly by
unification or by associating a set of constraints with
the phrasal template. In our implementation,
constraints are expressed by Prolog goals taking
morphosyntactic descriptions as arguments. For example,
the phrasal template in (12) is selected as a candidate
for our input translation pair in (10), as the lexical
descriptions of sit, in and on match the categories iv,
adv and p, respectively. A sequence of words can match
several phrasal templates, e.g. in our specific example
the following candidate is also selected, since in can
also be a noun (e.g. the ins and outs of a problem):



(13) tv_{a,b,c} & n_c & p_{c,d} <-> VP_{a,b}/NP_d
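
The candidate selection step can be sketched as follows. Our implementation expresses the constraints as Prolog goals; the Python fragment below is only an illustration of the matching logic. The toy lexicon, the function name select_candidates, the third template, and the category sequence tv, n, p assumed for the English side of (13) are all hypothetical. English unknown words are made to block generation, in line with the policy reported in Section 4.

```python
# Hypothetical lexicon: each English word maps to the set of lexical
# categories it can bear.
LEXICON = {
    "sit": {"iv", "tv"},
    "in": {"adv", "p", "n"},
    "on": {"p"},
}

# Phrasal templates: English category sequence plus the Spanish
# initial symbol. Labels refer to the examples in the text; the last
# entry is made up to show a template that does not match.
TEMPLATES = {
    "(12)": (("iv", "adv", "p"), "VP_{a,b}/NP_d"),
    "(13)": (("tv", "n", "p"), "VP_{a,b}/NP_d"),
    "unrelated": (("tv", "d", "n"), "VP_{a,b}"),
}

def select_candidates(words, lexicon=LEXICON, templates=TEMPLATES):
    """Return the labels of all templates whose English side matches
    the lexical descriptions of the input words."""
    if any(word not in lexicon for word in words):
        return []  # unknown English word: generation is blocked
    return [
        label
        for label, (cats, _initial_symbol) in templates.items()
        if len(cats) == len(words)
        and all(cat in lexicon[word] for cat, word in zip(cats, words))
    ]

candidates = select_candidates(["sit", "in", "on"])
```

Both (12) and (13) are returned for sit in on, because the lookup alone cannot decide between the adverb and the noun reading of in. The ambiguity is resolved later, by parsing and by lexicographer validation.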



Given a candidate phrasal template, the core of the
generation procedure is a call to a target language
grammar, Spanish in this case, which parses the target
input words using the given phrasal category (with its
associated indices) as its initial symbol.

A phrasal template may disjunctively specify several
phrasal categories as initial symbols (or,
equivalently, different phrasal templates may share the
same English side, while specifying different initial
symbols). For instance, English adjectives may be
equivalent to either a Spanish adjectival phrase or a
prepositional phrase, as already mentioned.



Figure 2 shows the result of parsing the Spanish
input words of our example, using the initial symbol
in the phrasal template in (12). Each node shows the
assigned syntactic category, along with the indices,
either specified in the initial symbol or instantiated
during parsing.

The pre-terminal categories in the parse tree, along
with the input words, are used to build the following
right hand side for the final LTR:

(14) participar:iv_{a,b,e} & como:p_{e,f} &
     observador:n_f & en:p_{a,d}

Finally, by instantiating the left hand side of the
template with the input English words and replacing
the right hand side phrasal category in the phrasal
template (12) with the right hand side in (14), the
LTR in (11) is obtained.
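
The final assembly step can be sketched as below. The representation of a template item as a (category, indices) pair, the helper names fmt and build_ltr, and the underscore notation for subscripts are assumptions made for this illustration; the pre-terminal sequence is the one read off the parse tree for our worked example.

```python
def fmt(word, cat, indices):
    """Render one LTR item, e.g. sit:iv_{a,b,c} or in:adv_c."""
    sub = indices[0] if len(indices) == 1 else "{" + ",".join(indices) + "}"
    return f"{word}:{cat}_{sub}"

def build_ltr(lhs_template, src_words, preterminals, tgt_words):
    """Instantiate the template left hand side with the English words
    and replace the phrasal category on the right hand side with the
    pre-terminals of the parse, paired with the Spanish words."""
    lhs = [fmt(w, cat, idx) for w, (cat, idx) in zip(src_words, lhs_template)]
    rhs = [fmt(w, cat, idx) for w, (cat, idx) in zip(tgt_words, preterminals)]
    return " & ".join(lhs) + " <-> " + " & ".join(rhs)

# English side of the phrasal template in (12).
lhs_template = [("iv", ("a", "b", "c")), ("adv", ("c",)), ("p", ("a", "d"))]
# Pre-terminal categories and indices of the Spanish parse.
preterminals = [("iv", ("a", "b", "e")), ("p", ("e", "f")),
                ("n", ("f",)), ("p", ("a", "d"))]

ltr = build_ltr(lhs_template, ["sit", "in", "on"],
                preterminals, ["participar", "como", "observador", "en"])
print(ltr)
```

The indices a, b and d supplied by the initial symbol are what tie the generated Spanish side back to the English side; e and f are introduced during parsing and stay local to the Spanish phrase.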

Again, several output LTRs can be generated for
the same input words. For example, the following
LTRs are also generated for our example:

(15) sit:iv_{a,b,c} & in:adv_c & on:p_{a,d} <->
     participar:iv_{a,b} & como:p_{a,e} &
     observador:n_e & en:p_{a,d}

(16) sit:iv_{a,b,c} & in:adv_c & on:p_{a,d} <->
     participar:iv_{a,b,e} & como:p_{e,f} &
     observador:n_f & en:p_{f,d}

In (15) the indices show that como is analyzed as
a modifier of participar, instead of a complement. In
(16) the preposition en modifies observador, instead
of participar. In our case, since we are interested in
translating only from English to Spanish, we would
keep only one of the candidate LTRs shown above, as
they all share the same English side. However, given
that the LTRs at hand are bidirectional, if one were
interested in translating from Spanish to English, one
might want to keep all the entries, in order to cover
different syntactic analyses.

The last step is the validation of LTRs by
lexicographers, which consists exclusively of removing
unwanted entries. Lexicographers use their linguistic
intuition and knowledge of the syntactic
representations used in LTRs in order to choose among
several competing LTRs for a given translation pair, or
to check the correctness of the analysis underlying an
LTR. Information that they are typically expected to
check is the way coindexing is performed (e.g. for
prepositions, in order to check that they are attached
to the correct item) and the syntactic categories
assigned to words, most crucially when unknown words
are involved. In our implementation lexicographers
are helped in the task by messages that the system
associates with candidate LTRs, in order to signal, for
instance, the presence of lexically unknown words or
lexical ambiguities which are potential sources of
errors (for instance, verbs which are both transitive
and intransitive). The files output by the system tend
to be self-contained in terms of the information needed
by lexicographers to perform their task, and
lexicographers usually do not need access to any extra
linguistic resources. We finally note that the
validation step becomes particularly crucial if the
input translation pairs are automatically acquired from
resources like bilingual corpora, in order to filter
out LTRs created from noisy bilingual equivalences.

Table 3: Overview of system performance.

4 Performance 

Results of this approach are shown in Table 3. Columns
are to be interpreted as follows:

File : The type of words or collocations in the
processed file.

In : The number of input word equivalences.

Out : The number of generated LTRs.

Val : The number of LTRs validated by the
lexicographers and thus added to the transfer lexicon.

InOut : The number of input word equivalences (In)
for which some output (Out) was provided. This
value is usually lower than the value of Out because
more than one LTR can be created for the same word
equivalence. Therefore, more than one element of Out
can correspond to one element of InOut.

InVal : The number of input word equivalences (In)
for which some LTR was validated (Val) by the
lexicographers.

% : The success rate, obtained by dividing the value
of InVal by the value of In. By using the value
of InVal instead of the value of Val we factor out
the extra valid LTRs that can be created for a
given input word equivalence, in addition to the
first one.
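
The relation between these columns can be illustrated with a short sketch. The per-equivalence record format and the function name table_columns are assumptions, and the sample records are invented for the illustration; they are not data from Table 3.

```python
def table_columns(records):
    """Compute the columns defined above from one record per input
    word equivalence, holding the number of LTRs generated for it and
    the number of those LTRs that were validated."""
    n_in = len(records)
    in_val = sum(1 for r in records if r["validated"] > 0)
    return {
        "In": n_in,
        "Out": sum(r["generated"] for r in records),
        "Val": sum(r["validated"] for r in records),
        "InOut": sum(1 for r in records if r["generated"] > 0),
        "InVal": in_val,
        "%": 100.0 * in_val / n_in,
    }

# Invented sample: four input equivalences.
records = [
    {"generated": 3, "validated": 1},  # ambiguous, one LTR kept
    {"generated": 0, "validated": 0},  # generation failed
    {"generated": 2, "validated": 2},  # two valid analyses kept
    {"generated": 1, "validated": 0},  # generated but rejected
]
columns = table_columns(records)
```

In this sample Val is 3 but InVal is 2, so the success rate is 50% rather than 75%: the second valid LTR for the third equivalence does not count as an additional success.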

The listed files are sorted according to the
chronological order in which they were processed, and
divided according to the methodology in use. One of
the most common reasons for failure is the presence
of unknown words on the English side. When the input
contains unknown words, we block generation. In
contrast, unknown words are accepted on the Spanish
side, matching any possible lexical category. This
treatment of Spanish unknown words is one of the
reasons for the higher value in Out than in InOut.
Genuine syntactic ambiguity is the other main reason
for this difference in values.

In terms of speed, we ran a test by evaluating the
development time of the file named 'Phrasal verbs -
batch 1' in Table 3. We had a lexicographer time
three activities: coding translation pairs, revising
automatically generated LTRs, and manually coding LTRs
for the translation pairs that failed in automatic
generation. The results are shown in Table 4.

5 Conclusion 

The described methodology makes LTR coding considerably
faster. According to our test, revising automatically
generated LTR files is about 8 times as fast as
manually coding LTRs. If the manual coding of word
equivalences is also counted in the automatic
generation process, the process is still about 3 times
as fast as the manual coding of complete LTRs.
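
Both ratios follow from the items-per-hour rates reported in Table 4, as the short check below shows. The dictionary keys and helper names are ours; the combined figure assumes, as a simplification, that the time for 100 items of automatic generation is the sum of the time for coding 100 translation pairs and the time for revising 100 LTRs.

```python
# Items per hour, as reported in Table 4.
RATE = {"coding_pairs": 31.25, "revising_ltrs": 50.57, "manual_ltrs": 6.25}

def hours_per_100(rate):
    """Hours needed to process 100 items at the given hourly rate."""
    return 100.0 / rate

def as_h_m(hours):
    """Convert fractional hours to (hours, minutes), rounded to the minute."""
    return divmod(round(hours * 60), 60)

# Revising versus manual coding.
revising_speedup = RATE["revising_ltrs"] / RATE["manual_ltrs"]

# Coding pairs plus revising, versus manual coding (simplifying assumption).
automatic_hours = (hours_per_100(RATE["coding_pairs"])
                   + hours_per_100(RATE["revising_ltrs"]))
overall_speedup = hours_per_100(RATE["manual_ltrs"]) / automatic_hours

print(as_h_m(hours_per_100(RATE["coding_pairs"])))
print(round(revising_speedup, 2), round(overall_speedup, 2))
```

The first ratio comes out slightly above 8 and the second slightly above 3, consistent with the rounded figures quoted above.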

Besides speed, this methodology guarantees the
syntactic correctness of the output (provided that the
templates are syntactically correct, of course). The
validation procedure only requires removing unwanted
LTRs, with no further editing intervention. Also, more
control over a transfer rule database is provided, as
each LTR can be associated with a template. In this
way testing, debugging, and maintenance in general
are easier and more effective, as these processes can
be performed on templates rather than on LTRs.

Activity                   N. of items        Items per hour   Time for 100 items
Coding translation pairs   2340               31.25            3h 12m
Revising LTRs              1416 (validated)   50.57            1h 59m
Manually coding LTRs       926                6.25             16h 00m

Table 4: Speed test results.

Although the described methodology can hardly
aspire to completeness, compared to manual coding,
it can be integrated with a manual coding phase for
the translation pairs that fail to generate
automatically. The gain in terms of labour effort is
still proportional to the success rate of the
generation procedure, when compared to entirely manual
coding. Moreover, there is one sense in which automatic
generation is more complete than manual coding, namely
in the generation of multiple entries for a translation
pair. Lexicographers tend to code one LTR per
translation pair, whereas the range of candidates
proposed by automatic generation can make them aware of
valid alternative analyses that they had not noticed.

Lexicographic work is easier, as little technical
knowledge is required of lexicographers. Full
knowledge of the formalism in use for LTRs is only
required in the initial phase of LTR corpus
construction. Once a template generation procedure is
in place, lexicographers only need a passive knowledge
of the formalism, i.e. they must be able to read and
understand LTRs, but not to write them.

Input word equivalences are expressed in plain natural
language. This makes them amenable to acquisition from
corpora or MRDs. If manual coding of word equivalences
is necessary, only bilingual speaking competence is
required of lexicographers. No further linguistic
background or familiarity with any formalism is
necessary.

The bootstrap approach makes this methodology more
suitable for scaling up large-scale MT systems than
for the rapid development of prototypes. The
knowledge acquired in prototype development is used
in developing a full-fledged system, which is often the
most critical phase for MT systems. The adoption of
this methodology is also profitable in porting an MT
system to a different language pair or in developing a
multilingual MT system: the methodology is applicable
to any language pair, and the linguistic knowledge
can also be reused to some extent, depending on the
similarity between the languages at hand.

References 



[1] Ann Copestake, Bernie Jones, Antonio Sanfilippo,
Horacio Rodriguez, Piek Vossen, Simonetta Montemagni,
and E. Marinai. Multilingual lexical representation.
Technical Report 253, University of Cambridge Computer
Laboratory, Cambridge, UK, 1992.

[2] Ann Copestake and Antonio Sanfilippo. Multilingual
lexical representation. In Proceedings of the AAAI
Spring Symposium on Building Lexicons for Machine
Translation, Stanford, California, USA, 1993.

[3] Pascale Fung. A statistical view on bilingual
lexicon extraction: From parallel corpora to
non-parallel corpora. In Proceedings of the Third
Conference of the Association for Machine Translation
in the Americas (AMTA-98), pages 1-17, Langhorne,
Pennsylvania, USA, 1998.

[4] I. Dan Melamed. Empirical methods for MT lexicon
development. In Proceedings of the Third Conference of
the Association for Machine Translation in the Americas
(AMTA-98), pages 18-30, Langhorne, Pennsylvania, USA,
1998.

[5] Fred Popowich, Davide Turcato, Olivier Laurens,
Paul McFetridge, J. Devlan Nicholson, Patrick McGivern,
Maricela Corzo-Pena, Lisa Pidruchney, and Scott
MacDonald. A lexicalist approach to the translation of
colloquial text. In Proceedings of the 7th
International Conference on Theoretical and
Methodological Issues in Machine Translation, pages
76-86, Santa Fe, New Mexico, USA, 1997.

[6] Davide Turcato. Automatically creating bilingual
lexicons for Machine Translation from bilingual text.
In Proceedings of the 17th International Conference on
Computational Linguistics and 36th Annual Meeting of
the Association for Computational Linguistics
(COLING-ACL'98), pages 1299-1306, Montreal, Quebec,
Canada, 1998.

[7] Davide Turcato, Olivier Laurens, Paul McFetridge,
and Fred Popowich. Inflectional information in transfer
for lexicalist MT. In Proceedings of the International
Conference 'Recent Advances in Natural Language
Processing' (RANLP-97), pages 98-103, Tzigov Chark,
Bulgaria, 1997.



